large language model

Terms from Artificial Intelligence: humans at the heart of algorithms

Large language models, as used in ChatGPT, have been one of the defining areas of the 'new AI'. At their simplest they are word predictors, taking a text and attempting to predict the next word that will appear. If this is repeated, whole new texts can be created. LLMs build on the success of simple statistical methods such as n-grams which, when trained on very large corpora, were found to be 'unreasonably effective' at tasks that had previously been thought to require complex natural language processing. LLMs leverage the same big data, drawn from web documents, media feeds, social media and forums, but use deep neural networks, which appear able to identify higher levels of meaning such as topics. The addition of attention mechanisms in transformer models has allowed LLMs to make use of long-range patterns in language, such as resolving the referents of pronouns or returning to previous topics. The text and chat produced by LLMs can be indistinguishable from that of humans, and they can thus be argued to be passing the Turing test.
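
To make the 'word predictor' idea concrete, the following is a minimal sketch (in Python, using an invented toy corpus) of the bigram (2-gram) approach that LLMs build on: count which words follow which, sample the next word from those counts, and repeat the prediction step to generate new text. Real LLMs replace the count table with a deep neural network trained on web-scale corpora.

    # Toy next-word predictor: a bigram model over a tiny invented corpus.
    import random
    from collections import defaultdict, Counter

    corpus = ("the cat sat on the mat . the dog sat on the rug . "
              "the cat chased the dog .").split()

    # Count how often each word follows each preceding word.
    follows = defaultdict(Counter)
    for prev, nxt in zip(corpus, corpus[1:]):
        follows[prev][nxt] += 1

    def predict_next(word):
        # Sample the next word in proportion to how often it followed `word`.
        counts = follows[word]
        words, weights = zip(*counts.items())
        return random.choices(words, weights=weights)[0]

    # Repeating the prediction step generates a whole new text.
    word, text = "the", ["the"]
    for _ in range(12):
        word = predict_next(word)
        text.append(word)
    print(" ".join(text))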

Used on pages 10, 289, 313, 314, 545, 556, 557, 568, 574, 582, 584

Also known as LLM